A Comparison on How Statistical Tests Deal with Concept Drifts
نویسندگان
چکیده
RCD is a framework proposed to deal with recurring concept drifts. It stores classifiers together with a sample of data used to train them. If a concept drift occurs, RCD tests all the stored samples with a sample of actual data, trying to verify if this is a new context or an old one that is recurring. This is performed by a non-parametric multivariate statistical test to make the verification. This paper describes how two statistical tests (KNN and Cramer) can distinguish between new and old contexts. RCD is tested with several base classifiers, in environments with different rates-of-change values, with gradual and abrupt concept drifts. Results show that RCD improves single classifiers accuracy independently of the statistical test used.
منابع مشابه
Concept drift detection in event logs using statistical information of variants
In recent years, business process management (BPM) has been highly regarded as an improvement in the efficiency and effectiveness of organizations. Extracting and analyzing information on business processes is an important part of this structure. But these processes are not sustainable over time and may change for a variety of reasons, such as the environment and human resources. These changes ...
متن کاملConcept Drift Detection with Hierarchical Hypothesis Testing | Proceedings of the 2017 SIAM International Conference on Data Mining | Society for Industrial and Applied Mathematics
When using statistical models (such as a classifier) in a streaming environment, there is often a need to detect and adapt to concept drifts to mitigate any deterioration in the model’s predictive performance over time. Unfortunately, the ability of popular concept drift approaches in detecting these drifts in the relationship of the response and predictor variable is often dependent on the dis...
متن کاملConcept Drift Detection and Adaptation with Hierarchical Hypothesis Testing
In a streaming environment, there is often a need for statistical prediction models to detect and adapt to concept drifts (i.e., changes in the underlying relationship between the response and predictor data streams being modeled) so as to mitigate deteriorating predictive performance over time. Various concept drift detection approaches have been proposed in the past decades. However, they do ...
متن کاملFast Adapting Ensemble: A New Algorithm for Mining Data Streams with Concept Drift
The treatment of large data streams in the presence of concept drifts is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts, and has been specifically desig...
متن کاملA Simple Unlearning Framework for Online Learning Under Concept Drifts
Real-world online learning applications often face data coming from changing target functions or distributions. Such changes, called the concept drift, degrade the performance of traditional online learning algorithms. Thus, many existing works focus on detecting concept drift based on statistical evidence. Other works use sliding window or similar mechanisms to select the data that closely ref...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012